Sneaker Finder v2.0 - Fast Sneaks
Learning the fastai API by refactoring an old tf/keras project
OVERVIEW
This project was initiated while I was an Insight Data Science fellow. It grew out of my interest in building data-driven tools for the fashion/retail space where I had most recently been working. The original, over-scoped idea was a shoe design tool that could quickly generate initial sneaker concepts from a few chosen examples and some text descriptors, with designs constrained by the "latent space" defined (discovered?) by a database of shoe images. Given the 3-week sprint allowed for development, however, I pared the tool down to a simple "aesthetic" recommender for sneakers, built on the same idea of an embedding space defined by the database of shoe images.
Part 0: DATA
The data has already been munged; details are in [01_data.ipynb].
Part 3: ResNet feature extractor
- Embed the database into the feature space.
- Evaluate with a simple logistic regression on the classification task.
import pandas as pd
import torchvision

filename = "zappos-50k-simplified_sort"
df = pd.read_pickle(f"data/{filename}.pkl")
Because we simply want to collect the features output by the model rather than perform classification (or some other decision), I replaced the classifier head with an identity mapping. A small Identity nn.Module class makes this simple.
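For reference, the Identity module is just a pass-through (a one-liner, assuming torch.nn is imported as nn):

class Identity(nn.Module):
    """Pass-through head: returns the pooled features unchanged."""
    def forward(self, x):
        return x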
Finally, since we are computing the features (the embedding) for over 30k images, let's load the computation onto the GPU. We need to remember to put the model in evaluation mode so the batch-norm and dropout layers are disabled. [I forgot to do this initially and lost hours trying to figure out why I wasn't getting consistent results.] Setting param.requires_grad = False saves memory since we aren't fitting any weights for now, and protects us in case we forget a with torch.no_grad() before inference.
ASIDE: I'm running the compute on what I call my data pizza oven: a Linux machine loaded with a powerful CPU, a cheap (but powerful) GPU, and a bunch of memory, all in the gutted shell of an old PowerMac G5 case (picked up at a garage sale for $25!). I call it the BrickOven Toaster. Check it out [here]
Later, when we use the full fastai API, this should all be handled elegantly behind the scenes.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
device
def get_ResNet_feature_net(to_cuda=False):
    # following the pattern for MnetV2, but the fastai resnet would also work (just remove fc)
    resnet = torchvision.models.resnet50(pretrained=True)
    num_ftrs = resnet.fc.in_features
    print(num_ftrs)
    resnet.fc = Identity()
    if to_cuda:
        device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    else:
        device = torch.device("cpu")
    resnet = resnet.to(device)
    resnet.eval()
    # freeze the weights, just in case we forget the no_grad()
    for param in resnet.parameters():
        param.requires_grad = False
    return resnet
rnet = get_ResNet_feature_net(to_cuda=True)
rnet
batch_size = 128
def get_x(r): return path_images/r['path']
#def get_y(r): return r['Category'] # we aren't actually using the category here (see 02_model.ipynb)
def get_fname(r): return r['path']
def get_dls(data, batch_size, size, device):
    # put everything in train, and don't do any augmentation since we are just going
    # to resize and normalize to imagenet_stats
    dblock = DataBlock(blocks=(ImageBlock, CategoryBlock),
                       splitter=IndexSplitter([]),
                       get_x=get_x,
                       get_y=get_fname,
                       item_tfms=Resize(size, method='pad', pad_mode='border'),
                       batch_tfms=Normalize.from_stats(*imagenet_stats))  # border pads white...
    dls = dblock.dataloaders(data, bs=batch_size, drop_last=False, device=device)
    # since we are just calculating the features for all the data, turn off shuffling
    dls.train.shuffle = False
    return dls
def get_all_feats(dls, conv_net):
    vects = []
    clss = []
    paths = []
    batchn = 0
    for imgs, classes in dls.train:
        with torch.no_grad():
            outs = conv_net(imgs)
        vects.extend(list(outs.data.cpu().numpy()))
        cs = classes.data.cpu().numpy()
        clss.extend(list(cs))
        # keep the paths for a sanity check (get_y returned the filename)
        ps = [dls[0].vocab[c] for c in cs]
        paths.extend(ps)
        batchn += 1
    # store all relevant info in a pandas dataframe
    df_feats = pd.DataFrame({"path": paths, "classes": clss, "features": vects})
    return df_feats
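The loops below iterate over a dict of image sizes (plus an ABBR dict mapping each size to a column suffix). The actual values are set in 01_data.ipynb, so the definitions here are assumed for illustration:

# assumed for illustration -- the real values live in 01_data.ipynb
IMG_SIZES = {"small": 160, "medium": 224, "large": 448}
ABBR = {"small": "sm", "medium": "md", "large": "lg"}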
for i, sz in enumerate(IMG_SIZES):
    print(IMG_SIZES[sz])
    dls = get_dls(df, batch_size, IMG_SIZES[sz], device)
    df_f = get_all_feats(dls, rnet)
    # save it
    filename = f"resnet50-features_{sz}"
    df_f.to_pickle(f"data/{filename}.pkl")
filename = f"resnet50-features_small"
df_sm = pd.read_pickle(f"data/{filename}.pkl")
filename = f"resnet50-features_medium"
df_md = pd.read_pickle(f"data/{filename}.pkl")
filename = f"resnet50-features_large"
df_lg = pd.read_pickle(f"data/{filename}.pkl")
df_test = pd.merge(df_sm,df_md,how='left',on='path',suffixes=('_sm','_md'))
df_test = pd.merge(df_test,df_lg,how='left',on='path')
df_test = df_test.rename(columns={"classes": "classes_lg", "features": "features_lg"})
# merge the features back onto the original dataframe explicitly:
df2 = pd.merge(df, df_test, how='left', on='path')
filename = "zappos-50k-resnet50-features_"
df2.to_pickle(f"data/{filename}.pkl")
df2 = df2.sort_values('path', ascending=True)
df2 = df2.reset_index(drop=True)
df2.head(3)
filename = "zappos-50k-resnet50-features_sort_3"
df2.to_pickle(f"data/{filename}.pkl")
df = df2
If we've already calculated everything, we can just load it:
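For example, reloading the pickle saved above:

filename = "zappos-50k-resnet50-features_sort_3"
df = pd.read_pickle(f"data/{filename}.pkl")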
query_image = "Shoes/Sneakers and Athletic Shoes/Nike/7716996.288224.jpg"
query_ind = df[df["path"]==query_image].index
#df[df['path']==query_image]
df.loc[query_ind,['path','classes_sm']]
The DataBlock performed a number of processing steps to prepare the images for embedding into the ResNet50 feature space (a 2048-dimensional vector). Let's confirm that reproducing those steps by hand gives the same image and the same features.
base_im = PILImage.create(path_images/query_image)
# NB: pass split_idx=1 to avoid funny business (Resize behaves deterministically at valid time)
img = Resize(IMG_SIZE, method='pad', pad_mode='border')(base_im, split_idx=1)
t2 = ToTensor()(img)
t2 = IntToFloatTensor()(t2)
t2 = torchvision.transforms.Normalize(*imagenet_stats)(t2)
t2.shape
That seemed to work well. I'll wrap it in a simple function for now, though a fastai Pipeline might work best in the long run.
def load_and_prep_sneaker(image_path, size=IMG_SIZE, to_cuda=False):
    """input: expects a Path(), but a string should work
    output: TensorImage ready to unsqueeze and "embed"
    TODO: make this a Pipeline?
    """
    base_im = PILImage.create(image_path)
    # NB: pass split_idx=1 to avoid funny business (deterministic valid-time Resize)
    img = Resize(size, method='pad', pad_mode='border')(base_im, split_idx=1)
    t2 = ToTensor()(img)
    t2 = IntToFloatTensor()(t2)
    t2 = torchvision.transforms.Normalize(*imagenet_stats)(t2)
    if to_cuda:
        device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    else:
        device = torch.device("cpu")
    return t2.to(device)
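For comparison, here is an untested sketch of the Pipeline idea from the TODO (fastcore's Pipeline accepts a split_idx, which keeps the transforms deterministic just like above):

# untested sketch: compose the same prep steps with a fastcore Pipeline
prep = Pipeline([PILImage.create,
                 Resize(IMG_SIZE, method='pad', pad_mode='border'),
                 ToTensor(),
                 IntToFloatTensor()], split_idx=1)
# t2 = torchvision.transforms.Normalize(*imagenet_stats)(prep(image_path))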
path_images/query_image
def get_convnet_feature(cnet, t_image, to_cuda=False):
    """
    input:
        cnet - our neutered & prepped feature net (ResNet or MobileNet_v2)
        t_image - TensorImage, probably 3x224x224... but could be a batch
        to_cuda - send to GPU? default is CPU (to_cuda=False)
    output:
        features - the feature vector (2048-d for ResNet50)
    """
    # this is redundant but safe
    if to_cuda:
        device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    else:
        device = torch.device("cpu")
    cnet = cnet.to(device)
    t_image = t_image.to(device)  # .to() is not in-place, so keep the assignment
    if len(t_image.shape) < 4:
        t_image = t_image.unsqueeze(0)
    with torch.no_grad():
        features = cnet(t_image)
    return features
query_image2 = '/home/ergonyc/Downloads/491212_01.jpg.jpeg'
query_t = load_and_prep_sneaker(path_images/query_image)
#test_feats = get_mnet_feature(mnetv2,query_t)
test_feats = get_convnet_feature(rnet, query_t)
test_feats.shape
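As a quick sanity check, the freshly computed features should match the stored ones (assuming IMG_SIZE here corresponds to the "small" setting):

# compare against the database features for the same image
stored = df.loc[query_ind, 'features_sm'].values[0]
np.allclose(test_feats.cpu().numpy().squeeze(), stored, atol=1e-4)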
Now I have the "embeddings" of the database in the ResNet50 feature space. I can run a logistic regression on these vectors to evaluate the embedding (mapping the 2048-d features to the 4 categories, as in Part 3), but I can also use an approximate KNN in this space to run the SneakerFinder tool.
Next up:
- make KNN functions; maybe approximate KNN, e.g. Annoy, for speed (see the sketch below), or precalculate
- PCA / t-SNE / UMAP the space with categories to visualize the embedding
- make widgets to turn this into an actual tool / API
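Here is an untested sketch of what the Annoy version could look like (assumes db_feats is the stacked (n_images, 2048) feature matrix built below):

from annoy import AnnoyIndex

def build_annoy_index(db_feats, n_trees=10):
    # one item per row of the feature matrix
    index = AnnoyIndex(db_feats.shape[1], 'euclidean')
    for i, v in enumerate(db_feats):
        index.add_item(i, v)
    index.build(n_trees)
    return index

# ann = build_annoy_index(db_feats)
# nn_ids = ann.get_nns_by_vector(query_feat, 5)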
Let's find the nearest neighbors as a proxy for "similar".
I'll start with a simple "gut" test, and point out that there really isn't a ground truth to refer to. Remember that the goal of all this is to find some shoes that someone will like, and we are using "similar" as an approximation of human preference.
Let's use our previously calculated sneaker features and check that the k nearest neighbors in our embedding space feel or look "similar".
Personally, I like Jordans, so I chose this as my query_image: 
from sklearn.neighbors import NearestNeighbors
import umap
def get_umap_reducer(latents):
    reducer = umap.UMAP(random_state=666)
    reducer.fit(latents)
    return reducer
num_neighs = 5
knns = []
reducers = []
for i, sz in enumerate(IMG_SIZES):
    print(ABBR[sz])
    print(IMG_SIZES[sz])
    features = f"features_{ABBR[sz]}"
    print(features)
    db_feats = np.vstack(df[features].values)
    neighs = NearestNeighbors(n_neighbors=num_neighs)  # could add one extra in case the query image is already in the database
    neighs.fit(db_feats)
    knns.append(neighs)
    reducer = get_umap_reducer(db_feats)
    reducers.append(reducer)
Let's take a quick look at the neighbors according to our list:
neighs = knns[0]
distance, nn_index = neighs.kneighbors(test_feats, return_distance=True)
dist = distance.tolist()[0]
df.columns
paths = df[['path','classes_sm','classes_md','classes_lg']]
neighbors = paths.iloc[nn_index.tolist()[0]].copy()
images = [ PILImage.create(path_images/f) for f in neighbors.path]
for im in images:
    display(im.to_thumb(IMG_SIZE, IMG_SIZE))
# img_row = df['path'].values[nn_index[0]]
# img_row = np.insert(img_row, 0, query_image, axis=0)
type(neighs)
def query_neighs(q_feat, myneighs, data, root_path, show=True):
    """
    q_feat: query feature (vector)
    myneighs: fit KNN object
    data: series or df containing "path"
    root_path: path to the image files
    """
    distance, nn_index = myneighs.kneighbors(q_feat, return_distance=True)
    dist = distance.tolist()[0]
    # fix path to the database...
    neighbors = data.iloc[nn_index.tolist()[0]].copy()
    images = [PILImage.create(root_path/f) for f in neighbors.path]
    if show:
        for im in images: display(im.to_thumb(IMG_SIZE, IMG_SIZE))
    return images
feature_func = get_convnet_feature
similar_images = []
for i, sz in enumerate(IMG_SIZES):
    print(ABBR[sz])
    print(IMG_SIZES[sz])
    features = f"features_{ABBR[sz]}"
    print(features)
    query_t = load_and_prep_sneaker(path_images/query_image, IMG_SIZES[sz])
    #query_f = get_convnet_feature(mnetv2,query_t)
    query_f = get_convnet_feature(rnet, query_t)
    similar_images.append(query_neighs(query_f, knns[i], paths, path_images, show=False))
im = PILImage.create(path_images/query_image)
display(im.to_thumb(IMG_SIZES[sz]))
def plot_sneak_neighs(images):
    '''Plot a matrix of images; images[row][0] should be the query image.
    Args:
        images: list of lists of PIL images
    Returns:
        None; saves the figure to image_search.png
    '''
    nrow = len(images)
    ncol = len(images[0])
    fig = plt.figure(figsize=(20, 20))
    num = 0
    for row, image_row in enumerate(images):
        for col, img in enumerate(image_row):
            plt.subplot(nrow, ncol, num + 1)
            plt.axis('off')
            plt.imshow(img)
            if col == 0:
                plt.title('Query')
            else:
                plt.title('Neighbor ' + str(col))
            num += 1
    plt.savefig('image_search.png')
    plt.show()
plot_sneak_neighs(similar_images)
similar_images2 = []
for i, sz in enumerate(IMG_SIZES):
    print(ABBR[sz])
    print(IMG_SIZES[sz])
    features = f"features_{ABBR[sz]}"
    print(features)
    query_t = load_and_prep_sneaker(path_images/query_image2, IMG_SIZES[sz])
    #query_f = get_convnet_feature(mnetv2,query_t)
    query_f = get_convnet_feature(rnet, query_t)
    similar_images2.append(query_neighs(query_f, knns[i], paths, path_images, show=False))
im = PILImage.create(path_images/query_image2)
display(im.to_thumb(IMG_SIZES[sz]))
plot_sneak_neighs(similar_images2)
df.columns
import seaborn as sns
from sklearn.decomposition import PCA
import umap
# first simple PCA
pca = PCA(n_components=2)
for i, sz in enumerate(IMG_SIZES):
    print(ABBR[sz])
    print(IMG_SIZES[sz])
    features = f"features_{ABBR[sz]}"
    print(features)
    data = df[['Category', features]].copy()
    db_feats = np.vstack(data[features].values)
    # PCA
    pca_result = pca.fit_transform(db_feats)
    data['pca-one'] = pca_result[:, 0]
    data['pca-two'] = pca_result[:, 1]
    print(f"Explained variation per principal component (sz{sz}): {pca.explained_variance_ratio_}")
    smpl_fac = .5
    plt.figure(figsize=(16, 10))
    sns.scatterplot(
        x="pca-one",
        y="pca-two",
        hue="Category",
        palette=sns.color_palette("hls", 4),
        data=data.sample(frac=smpl_fac),
        legend="full",
        alpha=0.3
    )
    plt.savefig(f'PCA 2-D sz{sz}')
    plt.show()
    # get the UMAP on deck
    embedding = reducers[i].transform(db_feats)
    data['umap-one'] = embedding[:, 0]
    data['umap-two'] = embedding[:, 1]
    plt.figure(figsize=(16, 10))
    sns.scatterplot(
        x="umap-one",
        y="umap-two",
        hue="Category",
        palette=sns.color_palette("hls", 4),
        data=data.sample(frac=smpl_fac),
        legend="full",
        alpha=0.3
    )
    plt.gca().set_aspect('equal', 'datalim')
    plt.title(f'UMAP projection of ResNet50-embedded UT-Zappos data (sz{sz})', fontsize=24)
    plt.savefig(f'UMAP 2-D sz{sz}')  # this savefig was missing its f-prefix
    plt.show()
def get_umap_embedding(latents):
    reducer = umap.UMAP(random_state=666)
    reducer.fit(latents)
    embedding = reducer.transform(latents)
    # for a fresh fit, transform on the training data equals reducer.embedding_
    assert np.all(embedding == reducer.embedding_)
    return embedding
fn = df.path.values
type(db_feats)
snk2vec = dict(zip(fn,db_feats))
snk2vec[list(snk2vec.keys())[0]]
embedding = get_umap_embedding(db_feats)
snk2umap = dict(zip(fn,embedding))
import ipywidgets as widgets

btn_run = widgets.Button(description='Find k-nearest neighbors')
out_pl = widgets.Output()
lbl_neighs = widgets.Label()
btn_upload = widgets.FileUpload()
def _load_image(im):
    """input: expects a Path(), but a string or bytes should work
    returns: resized & squared image
    """
    image = PILImage.create(im)
    # NB: pass split_idx=1 to avoid funny business (deterministic valid-time Resize)
    image = Resize(IMG_SIZE, method='pad', pad_mode='border')(image, split_idx=1)
    return image
def _prep_image(image, to_cuda=False):
    """input: squared/resized PIL image
    output: TensorImage ready to unsqueeze and "embed"
    TODO: make this a Pipeline?
    """
    t2 = ToTensor()(image)
    t2 = IntToFloatTensor()(t2)
    t2 = torchvision.transforms.Normalize(*imagenet_stats)(t2)
    if to_cuda:
        device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    else:
        device = torch.device("cpu")
    return t2.to(device)
#img = _load_img(im).flip_lr()
conv_net = rnet
def on_click_find_similar(change):
    """Embed the uploaded image and display its nearest neighbors."""
    im = btn_upload.data[-1]
    img = _load_image(im)
    tensor_im = _prep_image(img, to_cuda=False)
    feats = get_convnet_feature(conv_net, tensor_im)
    # uses the most recently fit `neighs` from the size loop above
    distance, nn_index = neighs.kneighbors(feats.numpy(), return_distance=True)
    dist = distance.tolist()[0]
    neighbors = df.iloc[nn_index.tolist()[0]].copy()
    out_pl.clear_output()
    images = [PILImage.create(path_images/f) for f in neighbors.path]
    with out_pl:
        display(img.to_thumb(200, 200))
        for i in images:
            display(i.to_thumb(100, 100))
    lbl_neighs.value = f'distances: {dist}'
btn_run.on_click(on_click_find_similar)
widgets.VBox([widgets.Label('Find your sneaker!'),
              btn_upload, btn_run, out_pl, lbl_neighs])
# import time
# # import matplotlib.pyplot as pltmodel
# import matplotlib.image as mpimg
# import matplotlib.pyplot as plt
# from mpl_toolkits.mplot3d import Axes3D
# import plotly
# import plotly.express as px
# import plotly.figure_factory as FF
import bokeh.plotting as bplt #import figure, show, output_notebook
#from bokeh.models import HoverTool, ColumnDataSource, CategoricalColorMapper
import bokeh
# from bokeh.palettes import Spectral10
import umap
#from scipy import spatial #for now just brute force to find neighbors
import scipy
#from scipy.spatial import distance
from io import BytesIO
import base64
##########################################
# BOKEH
##########################################
def init_bokeh_plot(umap_df):
    bplt.output_notebook()
    datasource = bokeh.models.ColumnDataSource(umap_df)
    color_mapping = bokeh.models.CategoricalColorMapper(factors=["sns", "goat"],
                                                        palette=bokeh.palettes.Spectral10)
    plot_figure = bplt.figure(
        title='UMAP projection VAE latent',
        plot_width=1000,
        plot_height=1000,
        tools=('pan, wheel_zoom, reset')
    )
    plot_figure.add_tools(bokeh.models.HoverTool(tooltips="""
    <div>
        <div>
            <img src='@image' style='float: left; margin: 5px 5px 5px 5px'/>
        </div>
        <div>
            <span style='font-size: 14px'>@fname</span>
            <span style='font-size: 14px'>@loss</span>
        </div>
    </div>
    """))
    plot_figure.circle(
        'x',
        'y',
        source=datasource,
        color=dict(field='db', transform=color_mapping),
        line_alpha=0.6,
        fill_alpha=0.6,
        size=4
    )
    return plot_figure
from PIL import Image  # needed by get_thumbnail

def embeddable_image(label):
    return image_formatter(label)

def get_thumbnail(path):
    i = Image.open(path)
    i.thumbnail((64, 64), Image.LANCZOS)
    return i

def image_base64(im):
    if isinstance(im, str):
        im = get_thumbnail(im)
    with BytesIO() as buffer:
        im.save(buffer, 'png')
        return base64.b64encode(buffer.getvalue()).decode()

def image_formatter(im):
    return f"data:image/png;base64,{image_base64(im)}"
# do we need it loaded... it might be fast enough??
#@st.cache
def load_UMAP_data():
    # model_name, params, and ut (pickle utils) come from the VAE-project config
    data_dir = f"data/{model_name}-X{params['x_dim'][0]}-Z{params['z_dim']}"
    load_dir = os.path.join(data_dir, f"kl_weight{int(params['kl_weight']):03d}")
    snk2umap = ut.load_pickle(os.path.join(load_dir, "snk2umap.pkl"))
    return snk2umap
def load_latent_data():
    data_dir = f"data/{model_name}-X{params['x_dim'][0]}-Z{params['z_dim']}"
    snk2umap = load_UMAP_data()
    # load df (filenames and latents); snk2vec and snk2loss are assumed loaded elsewhere
    mids = list(snk2vec.keys())
    vecs = np.array([snk2vec[m] for m in mids])
    vec_tree = scipy.spatial.KDTree(vecs)
    latents = np.array(list(snk2vec.values()))
    losses = np.array(list(snk2loss.values()))
    labels = np.array(mids)
    labels2 = np.array(list(snk2umap.keys()))
    embedding = np.array(list(snk2umap.values()))
    assert np.all(labels == labels2)
    umap_df = pd.DataFrame(embedding, columns=('x', 'y'))
    umap_df['digit'] = [str(x.decode()) for x in labels]
    umap_df['image'] = umap_df.digit.map(lambda f: embeddable_image(f))
    umap_df['fname'] = umap_df.digit.map(lambda x: f"{x.split('/')[-3]} {x.split('/')[-1]}")
    umap_df['db'] = umap_df.digit.map(lambda x: f"{x.split('/')[-3]}")
    umap_df['loss'] = [f"{x:.1f}" for x in losses]
    return umap_df, snk2vec, latents, labels, vecs, vec_tree, mids
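Hypothetical usage, wiring the loader to the plot (assumes the pickles and config referenced above exist):

umap_df, snk2vec, latents, labels, vecs, vec_tree, mids = load_latent_data()
bplt.show(init_bokeh_plot(umap_df))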
#%%
# pca_result = pca.fit_transform(df['feats'].values.tolist())
# df['pca-one'] = pca_result[:,0]
# df['pca-two'] = pca_result[:,1]
# df['pca-three'] = pca_result[:,2]
# print('Explained variation per principal component: {}'.format(pca.explained_variance_ratio_))
# #data=df.sample(frac=1.0)
# #data=df.reindex(rndperm)
# data = df
# #df_subset = df
# time_start = time.time()
# tsne = TSNE(n_components=2, verbose=1, perplexity=40, n_iter=300)
# tsne_results = tsne.fit_transform(db_feats)
# print('t-SNE done! Time elapsed: {} seconds'.format(time.time()-time_start))
# df['tsne-2d-one'] = tsne_results[:,0]
# df['tsne-2d-two'] = tsne_results[:,1]
# plt.figure(figsize=(16,10))
# sns.scatterplot(
# x="tsne-2d-one", y="tsne-2d-two",
# hue="CategoryDir",
# palette=sns.color_palette("hls", 4),
# data=df,
# legend="full",
# alpha=0.3
#)
# import matplotlib.image as mpimg
# import random
# from PIL import Image
# import requests
# from io import BytesIO
from sklearn.metrics import confusion_matrix
from seaborn import heatmap
from sklearn.linear_model import LogisticRegression
#Display Confusion Matrix
X_test = np.vstack(df[df.t_t_v=='test']['features'])
y_test = df[df.t_t_v=='test']['Category'].values  # keep y 1-d; np.vstack would make a column vector and trigger a sklearn warning
X_train = np.vstack(df[df.t_t_v=='train']['features'])
y_train = df[df.t_t_v=='train']['Category'].values
clf_log = LogisticRegression(C=1, multi_class='ovr', max_iter=2000, solver='lbfgs')
clf_log.fit(X_train, y_train)
log_score = clf_log.score(X_test, y_test)
log_ypred = clf_log.predict(X_test)
log_confusion_matrix = confusion_matrix(y_test, log_ypred)
print(log_confusion_matrix)
disp = heatmap(log_confusion_matrix, annot=True, linewidths=0.5, cmap='Blues')
plt.savefig('log_Matrix.png')
plt.figure(figsize=(16,16))
# Plot non-normalized confusion matrix
titles_options = [("Confusion matrix, without normalization", None),
                  ("Normalized confusion matrix", 'true')]
class_names = df.Category.unique()
from sklearn.metrics import plot_confusion_matrix
for title, normalize in titles_options:
    disp = plot_confusion_matrix(clf_log, X_test, y_test,
                                 display_labels=class_names,
                                 cmap=plt.cm.Blues,
                                 normalize=normalize)
    disp.ax_.set_title(title)
    print(title)
    print(disp.confusion_matrix)
plt.savefig('log_Matrix2.png')
def get_x(r): return path_images/r['path']
def get_y(r): return r['Category']
def splitter(df):
    train = df.index[df['train']].tolist()
    valid = df.index[df['validate']].tolist()
    return train, valid
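Note the splitter expects boolean train/validate columns; presumably these are derived from the t_t_v column used earlier, e.g.:

# assumed: derive the boolean split columns from the t_t_v labels
df['train'] = df.t_t_v == 'train'
df['validate'] = df.t_t_v == 'validate'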
# splitter=RandomSplitter(valid_pct=0.3,seed=42),
# get_x=get_x,
# get_y=get_y,
# #item_tfms=Resize(224)
# #item_tfms = RandomResizedCrop(224,min_scale=0.95)
# )
# dls = dblock.dataloaders(df)
doc(DataBlock)
imagenet_stats
batch_tfms=Normalize.from_stats(*imagenet_stats)
tfms = aug_transforms(mult=1.0,
                      do_flip=True,
                      flip_vert=False,
                      max_rotate=5.0,
                      min_zoom=1.0,
                      max_zoom=1.05,
                      max_lighting=0.1,
                      max_warp=0.05,
                      p_affine=0.75,
                      p_lighting=0.0,
                      xtra_tfms=None,
                      size=None,
                      mode='bilinear',
                      pad_mode='reflection',
                      align_corners=True,
                      batch=False,
                      min_scale=1.0)
# now use the real splitter and light augmentation; resize to 160
dblock = DataBlock(blocks=(ImageBlock, CategoryBlock),
                   splitter=splitter,
                   get_x=get_x,
                   get_y=get_y,
                   item_tfms=Resize(160, method='pad', pad_mode='border'),
                   batch_tfms=tfms)  # border pads white...
dls = dblock.dataloaders(df, bs=64, drop_last=False)
list(models.mobilenet_v2()._modules.items())[1]  # ._modules.items is a method, so call it
mobilenet_split = lambda m: (m[0][0][10], m[1])
learn = cnn_learner(dls, models.mobilenet_v2, splitter=mobilenet_split,cut=-1, pretrained=True,metrics=error_rate)
#learn = cnn_learner(dls, model_conv, splitter=mobilenet_split,cut=-1, pretrained=True)
lr_min,lr_steep = learn.lr_find()
lr_min, lr_steep
doc(learn.fine_tune)
learn.predict(dls.dataset[10][0])
learn.fine_tune(1)  # fine_tune requires an epoch count
learn.fit_one_cycle(6, lr_max=1e-5)
learn.recorder.plot_loss()
model_conv = torchvision.models.mobilenet_v2(pretrained=True)
for param in model_conv.parameters():
    param.requires_grad = False
# Parameters of newly constructed modules have requires_grad=True by default
# just read this off: model_conv.classifier
num_categories = 4
num_ftrs = model_conv.classifier._modules['1'].in_features
model_conv.classifier._modules['1'] = nn.Linear(num_ftrs, num_categories)
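For reference, printing model_conv.classifier before the swap shows where num_ftrs comes from:

# model_conv.classifier (pretrained, before replacement):
# Sequential(
#   (0): Dropout(p=0.2, inplace=False)
#   (1): Linear(in_features=1280, out_features=1000, bias=True)
# )
# so num_ftrs == 1280, and we swap in nn.Linear(1280, 4)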
def trns_mobilenet_v2():
    model_conv = torchvision.models.mobilenet_v2(pretrained=True)
    for param in model_conv.parameters():
        param.requires_grad = False
    # parameters of newly constructed modules have requires_grad=True by default;
    # num_ftrs is just read off model_conv.classifier
    num_ftrs = model_conv.classifier._modules['1'].in_features
    model_conv.classifier._modules['1'] = nn.Linear(num_ftrs, num_categories)
    return model_conv
mnetV2 = torchvision.models.mobilenet_v2()
import torchvision
from torchvision import models
# def _mobilenetv2_split(m: nn.Module):
#     return (m[0][0][10], m[1])
mobilenet_split = lambda m: (m[0][0][10], m[1])
#arch = torchvision.models.mobilenet_v2
model_conv = models.mobilenet_v2(pretrained=True)
#learn = cnn_learner(dls, models.mobilenet_v2, cut=-1, pretrained=True)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model_conv = model_conv.to(device)